---
title: "Cleaning V-Party data"
---

# Load

```{r}
# load packages
  source("helper-packages.R")

# load V-Party dataset
  vpart_raw <- 
    import("../raw-data/x-vparty/V-Dem-CPD-Party-V1.dta", encoding = "UTF-8")
```

# Clean

* v2parelig
    * Question: To what extent does this party invoke God, religion, or sacred/religious texts to justify its positions?
    * Responses:
        * 0: Always, or almost always. The party almost always invokes God, religion, or sacred/religious texts to justify its positions.
        * 1: Often, but not always. The party often, but not always, invokes God, religion, or religious texts to justify its positions.
        * 2: About half of the time. The party about half of the time invokes God, religion, or religious texts to justify its positions.
        * 3: Rarely. The party rarely invokes God, religion, or religious texts to justify its positions.
        * 4: Never. The party never invokes God, religion, or religious texts to justify its positions.

Selection information:

"Background on the Varieties of Party Identity and Organization Dataset (V-Party)

In January 2020, 665 carefully selected country experts have assessed the identity of the political parties in their country of expertise with a vote share of more than 5% in a legislative election between 1970 and 2019 across 169 countries. This generates a dataset on 1,955 political parties across 1,560 elections--- or in total 6,330 party-election year units with 183,570 expert-coded data points. Typically, at least four locally-based experts contributed to each question. Coder responses were aggregated using V-Dem’s custom-built statistical model to ensure comparability across countries and time."

Source: <https://www.v-dem.net/static/website/img/refs/vparty_briefing.pdf>
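
The response scale above runs from 0 (always) to 4 (never), so "always or often" corresponds to the two lowest codes. As a minimal sketch of how that scale can be collapsed into a country/election-year indicator, the toy example below recodes the ordinal score, takes the share of religious parties per group, and turns that share into a 0/1 flag. The data frame `toy_parties` and the columns `country`, `share_religious`, and `any_religious` are hypothetical names used only for illustration; they are not part of the V-Party data.

```r
library(dplyr)

# Hypothetical toy data: one row per party in a country/election year
toy_parties <- data.frame(
  country       = c("A", "A", "A", "B", "B", "C"),
  year          = c(2000, 2000, 2000, 2000, 2000, 2000),
  v2parelig_ord = c(0, 3, 4, 2, 4, NA)
)

toy_shares <- toy_parties %>%
  mutate(
    # codes 0 and 1 are "always" and "often"; codes 2 to 4 are everything else
    always_often_religion = case_when(
      v2parelig_ord %in% c(0, 1)    ~ 1,
      v2parelig_ord %in% c(2, 3, 4) ~ 0,
      TRUE                          ~ NA_real_
    )
  ) %>%
  group_by(country, year) %>%
  summarise(
    # mean() over an all-missing group with na.rm = TRUE returns NaN
    share_religious = mean(always_often_religion, na.rm = TRUE),
    .groups = "drop"
  ) %>%
  # NaN > 0 evaluates to NA, so groups with no information stay missing
  mutate(any_religious = as.numeric(share_religious > 0))

toy_shares
```

In this toy data, country A has one religious party out of three, country B has none, and country C has no scored party at all, so its indicator stays missing and would be dropped by a later `filter(!is.na(...))`.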

```{r}
# clean
  vpart_clean <- 
    vpart_raw %>% 
    mutate(
      
  # generate country name      
      vpart_common_country_name = countryname(country_name),
      
  # generate a variable that takes 1 if a party always or often invokes religion (v2parelig_ord of 0 or 1)
           always_often_religion = 
             case_when(
               v2parelig_ord %in% c(0, 1) ~ 1, 
               v2parelig_ord %in% c(2, 3, 4) ~ 0, 
               TRUE ~ NA_real_)) %>%

  # share of religious parties by country/election year
  # (NaN when every party in the group has a missing score)
    group_by(year, vpart_common_country_name) %>%
      summarise(mean_prop_of_parties_religious = mean(always_often_religion, na.rm = TRUE)) %>% 
    ungroup() %>%
  
  # binary indicator: 1 when the religious party share is non-zero, 0 when it is zero,
  # and NA when no party in that country/election year has a score
    mutate(religious_party_in_nat_assembly = (mean_prop_of_parties_religious > 0)*1) %>% 
  
  # drop country/election years with no non-missing party scores
    filter(!is.na(religious_party_in_nat_assembly)) %>% 
  
  # order the dataframe
    arrange(vpart_common_country_name, year)
    
# expand dataframe to make an annual panel; each election's value is carried forward until the next election
  vpart_clean_expanded <- 
    expand.grid(
      vpart_common_country_name = unique(vpart_clean$vpart_common_country_name),
      year = 1950:2022,
      stringsAsFactors = FALSE) %>% 
      arrange(vpart_common_country_name, year) %>% 
    left_join(vpart_clean, by = c("vpart_common_country_name", "year")) %>% 
    group_by(vpart_common_country_name) %>% 
      fill(religious_party_in_nat_assembly) %>% 
    ungroup() %>% 
    filter(!is.na(religious_party_in_nat_assembly)) %>% 
    select(
      vparty_country_common = vpart_common_country_name,
      vparty_year = year,
      vparty_religious_party_in_nat_assembly = religious_party_in_nat_assembly)
```
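
The expansion step above builds a full country-by-year grid, joins the election-year values onto it, and fills each value downward within a country. The sketch below shows the same pattern on a small toy data set, to make clear which rows survive. The objects `toy_elections` and `toy_panel` and the column `flag` are hypothetical and exist only for this illustration.

```r
library(dplyr)
library(tidyr)

# Hypothetical toy data: values observed only in election years
toy_elections <- data.frame(
  country = c("A", "A", "B"),
  year    = c(2001, 2004, 2003),
  flag    = c(1, 0, 1)
)

toy_panel <- expand.grid(
    country = unique(toy_elections$country),
    year = 2000:2005,
    stringsAsFactors = FALSE   # keep the join key as character
  ) %>%
  arrange(country, year) %>%
  left_join(toy_elections, by = c("country", "year")) %>%
  group_by(country) %>%
  fill(flag) %>%               # default direction is "down": carry the last value forward
  ungroup() %>%
  filter(!is.na(flag))         # years before a country's first election are dropped

toy_panel
```

Because `fill()` only carries values forward, years before a country's first observed election remain missing and are removed, while years after the last election inherit that election's value through the end of the grid.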

# Save data

```{r}
  saveRDS(vpart_clean_expanded, "../cleaned-data/x-3-vparty.rds")
```
